Specification and execution of real-time streaming applications

ABSTRACT

Systems and methods to specify and execute real-time streaming applications are provided. The method includes specifying an application topology for an application including spouts, bolts, connections, a global hash table, and a topology manager. Each spout receives input data and each bolt transforms the input data. The global hash table allows in-memory communication between each spout and bolt and others of the spouts and the bolts. The topology manager manages the application topology. The method includes compiling the application into a shared or static library for applications, and exporting a special symbol associated with the application. The runtime system can be used to retrieve the application topology from the shared or static library based on the special symbol and execute the application topology on a single node or distribute it across multiple nodes.

RELATED APPLICATION INFORMATION

This application claims priority to U.S. Provisional Patent Application No. 62/816,426, filed on Mar. 11, 2019, which is incorporated by reference herein in its entirety.

BACKGROUND

Technical Field

The present invention relates to streaming applications and more particularly specification and execution of real-time streaming applications on various platforms.

Description of the Related Art

Streaming and real-time computation systems provide capabilities to write distributed, streaming applications. Streaming media refers to multimedia that is constantly received by and presented to an end-user while being delivered by a provider. Streaming data continues to gain in importance because of the growing number of data sources that continuously produce and offer data. These include, for example, the Internet of Things, multimedia, click streams, as well as device and server logs.

SUMMARY

According to an aspect of the present principles, a method is provided to specify and execute real-time streaming applications. The method includes specifying an application topology for an application including spouts, bolts, connections, a global hash table, and a topology manager. Each spout receives input data and each bolt transforms the input data. The global hash table allows in-memory communication between each spout and bolt and others of the spouts and the bolts. The topology manager manages the application topology. The method includes compiling the application into a shared library for applications, and exporting a special symbol associated with the application. The runtime system can be used to retrieve the application topology from the shared library based on the special symbol and execute the application topology on a single node or distribute it across multiple nodes.

According to another aspect of the present principles, a system is provided to specify and execute real-time streaming applications. The system includes a processor device operatively coupled to a memory device, the processor device being configured to specify an application topology for an application including spouts, bolts, connections, a global hash table, and a topology manager. Each spout receives input data and each bolt transforms the input data. The global hash table allows in-memory communication between each spout and bolt and others of the spouts and the bolts. The topology manager manages the application topology. The processor device is configured to compile the application into a shared library for applications, and to export a special symbol associated with the application. The runtime system can be used to retrieve the application topology from the shared library based on the special symbol and execute the application topology on a single node or distribute it across multiple nodes.

These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:

FIG. 1 is a block diagram showing an exemplary processing system, in accordance with an embodiment of the present invention;

FIG. 2 is a block diagram illustrating an application topology for a streaming system, in accordance with an embodiment of the present invention;

FIG. 3 is a block diagram illustrating a streaming component for receiving and handling input data streams, in accordance with an embodiment of the present invention;

FIG. 4 is a block diagram illustrating a streaming component for processing and transforming input data streams, in accordance with the present principles;

FIG. 5 is a flowchart illustrating a process of application execution, in accordance with an embodiment of the present invention;

FIG. 6 is a flowchart illustrating a procedure for processing a request from a streaming component, in accordance with an embodiment of the present invention;

FIG. 7 is a block diagram illustrating a streaming platform device implementing a runtime system for real-time streaming applications, in accordance with the present principles; and

FIG. 8 is a block diagram illustrating a streaming system architecture, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

In accordance with the present principles, systems and methods are provided for implementing a stream programming model and an underlying runtime system, capable of efficiently handling streaming applications, including streaming video applications, on a distributed, multi-platform environment.

In one embodiment, the system provides support for multi-platform deployment including (a) multiple platforms (e.g., edge processing device, server, or cloud; Windows™/Linux™/Android™), and (b) multiple computing architectures (server/cloud processing only, or a combination of edge processing and server/cloud processing).

In one embodiment, the system provides support for a specialized programming model and built-in support for writing streaming video/non-video applications with the ability to run for a finite duration or to define a termination condition and cleanly terminate, if required. Example embodiments are scalable across a variety of hardware platforms (edge, server or cluster/cloud) and support efficient partitioning of streaming workload between edge-devices and centralized servers/cloud.

Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.

Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. The medium may include a computer-readable storage medium such as a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random-access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.

Each computer program may be tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.

A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution. Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) may be coupled to the system either directly or through intervening I/O controllers.

Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.

FIG. 1 is a block diagram showing an exemplary processing system 100, in accordance with an embodiment of the present invention. The processing system 100 includes a set of processing units (e.g., CPUs) 101, a set of GPUs 102, a set of memory devices 103, a set of communication devices 104, and a set of peripherals 105. The CPUs 101 can be single or multi-core CPUs. The GPUs 102 can be single or multi-core GPUs. The one or more memory devices 103 can include caches, RAMs, ROMs, and other memories (flash, optical, magnetic, etc.). The communication devices 104 can include wireless and/or wired communication devices (e.g., network (e.g., WIFI, etc.) adapters, etc.). The peripherals 105 can include a display device, a user input device, a printer, an imaging device, and so forth. Elements of processing system 100 are connected by one or more buses or networks (collectively denoted by the figure reference numeral 110).

In an embodiment, memory devices 103 can store specially programmed software modules to transform the computer processing system into a special purpose computer configured to implement various aspects of the present invention. In an embodiment, special purpose hardware (e.g., Application Specific Integrated Circuits, Field Programmable Gate Arrays (FPGAs), and so forth) can be used to implement various aspects of the present invention.

In an embodiment, memory devices 103 store program code for implementing one or more of the following: an application topology 170, a runtime system 180, a global hash table 190, etc. The application topology 170 includes any application implemented by the streaming system as a topology, which is an arrangement of spouts and bolts, for example as described with respect to FIG. 2 herein below. The runtime system 180 takes care of running streaming applications (video and non-video), built using the programming models described herein, on a variety of platforms. The global hash table 190 maintains a common, in-memory, shared storage area, accessible to all instances of spouts and/or bolts, as described herein below. The global hash table 190 allows in-memory communication between spouts and bolts, irrespective of their order in the application topology 170.

The processing units 101 decide at runtime, based on the node(s) information in the runtime request, whether to deploy and execute the application topology within a single process consisting of multiple threads on a single node or to deploy and execute the application topology using multiple processes distributed across multiple nodes. A low-level, topology-aware, inter-process communication mechanism is used to transfer data items/tuples between spouts and bolts.

Of course, the processing system 100 may also include other elements (not shown), as readily contemplated by one of skill in the art, as well as omit certain elements. For example, various other input devices and/or output devices can be included in processing system 100, depending upon the particular implementation of the same, as readily understood by one of ordinary skill in the art. For example, various types of wireless and/or wired input and/or output devices can be used. Moreover, additional processors, controllers, memories, and so forth, in various configurations can also be utilized. These and other variations of the processing system 100 are readily contemplated by one of ordinary skill in the art given the teachings of the present invention provided herein.

Moreover, it is to be appreciated that the various elements and steps relating to the present invention, described below with respect to the various figures, may be implemented, in whole or in part, by one or more of the elements of system 100.

Referring now to FIG. 2, a block diagram of an application topology 200 for a streaming system is illustratively depicted in accordance with an embodiment of the present invention. Although a particular number of each type of component and/or layer of the application topology is illustrated, it should be understood that more or fewer of each component and/or layer can be included.

As shown in FIG. 2, the application topology 200 includes components referred to as input spouts 210 (shown as 210-1 to 210-n), and bolts 220. The topology can be implemented as a Directed Acyclic Graph (DAG). The computational work in the application topology 200 is delegated to these different types of components, each of which is responsible for a specific processing task. An input data stream is handled by components referred to herein as spouts 210, which pass the data stream to components called bolts 220. As illustrated in FIG. 2, an application topology 200 can include different layers of bolts 220 (bolt layers 1 to 3 (230), and output bolt layer 280, by way of non-limiting example). Each bolt 220 processes and transforms the input data stream received from the previous spout 210/bolt 220 in a predetermined manner and passes the transformed data items/tuples to the successive bolt(s) 220 in the chain. Tuples refer to a data structure that is an immutable, or unchangeable, ordered sequence of elements. In a relational database, a tuple is one record. Any streaming application is written as a topology, which is an arrangement of components (for example, spouts 210 and one or more chains of bolt 220 components) and connections between them.

The example embodiments provide and implement a programming model that allows the user to define and declare various spouts 210 and bolts 220, and also create multiple instances of spouts 210/bolts 220, so that the specific task that the spout 210/bolt 220 performs can be done in parallel on multiple data items/tuples.

Spouts 210 and bolts 220 within a topology communicate using one of the three different types of connections offered by the programming model, for example shuffle 250 (denoted in FIG. 2 by a broken line), tag 260 (denoted in FIG. 2 by a broken line with intermittent dots) and broadcast 270 (denoted in FIG. 2 by solid lines with an arrow) connections.

Each shuffle connection 250 takes a tuple from the producer component (spout 210/bolt 220) and sends the tuple to a randomly chosen consumer (bolt 220), ensuring that each instance of the consumer (bolt 220) receives about the same number of tuples.

Each tag connection 260 allows a user to control how tuples are sent to bolts 220, based on one or more tags (fields) in a tuple. Tuples with the same tags are always guaranteed to be sent to the same bolt 220.

Each broadcast connection 270 sends a tuple to all instances of all the receiving bolts 220.
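By way of non-limiting illustration, the selection of consumer instances for the three connection types can be sketched as follows. The names `ConnectionKind` and `selectConsumers` are hypothetical and are not part of the programming model described herein. Hashing the tag value to pick an instance is merely one assumed way to ensure that tuples with the same tag reach the same bolt instance.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical connection kinds mirroring shuffle 250, tag 260 and broadcast 270.
enum class ConnectionKind { Shuffle, Tag, Broadcast };

// Returns the indices of the consumer instances that receive one tuple.
// `tagValue` is only consulted for tag connections.
std::vector<std::size_t> selectConsumers(ConnectionKind kind,
                                         std::size_t instanceCount,
                                         const std::string& tagValue,
                                         std::mt19937& rng) {
    assert(instanceCount > 0 && "a connection needs at least one consumer instance");
    std::vector<std::size_t> targets;
    switch (kind) {
    case ConnectionKind::Shuffle: {
        // A uniformly random choice spreads tuples about evenly over instances.
        std::uniform_int_distribution<std::size_t> pick(0, instanceCount - 1);
        targets.push_back(pick(rng));
        break;
    }
    case ConnectionKind::Tag:
        // Assumption: equal tag values hash to the same instance index.
        targets.push_back(std::hash<std::string>{}(tagValue) % instanceCount);
        break;
    case ConnectionKind::Broadcast:
        // Every instance of the receiving bolt gets a copy of the tuple.
        for (std::size_t i = 0; i < instanceCount; ++i) targets.push_back(i);
        break;
    }
    return targets;
}
```

In this sketch a tag connection is deterministic for a given tag value and instance count, whereas a shuffle connection only balances the load statistically over many tuples.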

A filter can be applied on any of the above described connections (shuffle 250, tag 260 and broadcast 270), so that data items/tuples can only be sent from the producer (spout 210/bolt 220) to the consumer (bolt 220) when the condition specified by the filter function is satisfied. If the condition specified by the filter function is not satisfied by the particular data item/tuple, then that data item/tuple is not passed on to the successive bolt(s) 220 in the chain.
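A minimal sketch of such a filter follows, assuming for illustration that a tuple is represented as a map of string keys to string values. The names `Tuple`, `Filter`, `keyEquals` and `forwardIfAccepted` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Assumed tuple representation: named string fields.
using Tuple = std::map<std::string, std::string>;
// A filter function returns true when the tuple may be passed on.
using Filter = std::function<bool(const Tuple&)>;

// Builds a filter that accepts tuples whose field `key` equals `value`.
Filter keyEquals(const std::string& key, const std::string& value) {
    assert(!key.empty());
    return [key, value](const Tuple& tuple) {
        Tuple::const_iterator it = tuple.find(key);
        return it != tuple.end() && it->second == value;
    };
}

// Appends `tuple` to `downstream` only when the filter accepts it.
// An empty (unset) filter accepts every tuple.
bool forwardIfAccepted(const Tuple& tuple, const Filter& filter,
                       std::vector<Tuple>& downstream) {
    if (filter && !filter(tuple)) return false;  // condition not satisfied: drop
    downstream.push_back(tuple);
    return true;
}
```

For example, a connection guarded by `keyEquals("X", "Y")` would pass on a tuple whose field "X" holds "Y" and drop all others.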

Parallelism within a topology is specified at the granularity of spouts 210/bolts 220. Each spout 210/bolt 220 has a unique name, and the user can specify the number of parallel instances of the spout 210/bolt 220 to be used. In FIG. 2, by way of non-limiting illustration, there are two instances 225 of bolt-I (220-I, shown as I₁ and I₂), bolt-A (220-A, shown as A₁ and A₂) and bolt-E (220-E, shown as E₁ and E₂); a single instance of bolt-J (220-J, shown as J₁), bolt-K (220-K, shown as K₁), bolt-D (220-D, shown as D₁), bolt-F (220-F, shown as F₁) and output bolt-O (220-O, shown as O₁); and n instances of bolt-B (220-B, shown as B₁ to Bₙ).

Each spout 210/bolt 220 can have multiple sources of input. For example, bolt-B in FIG. 2 has two sources of input, e.g., bolt-J (220-J) and bolt-K (220-K). Also, each spout 210/bolt 220 can pass on a data stream/tuple to multiple bolts with different types of connections. For example, bolt-K (220-K) uses a shuffle (250) connection to communicate with bolt-B and uses a broadcast (270) connection to communicate with bolt-C (220-C).

According to example embodiments, topology 200 can have one or more spouts 210 and generally has an output bolt 220-O, which provides the final output from the streaming application. In FIG. 2, bolt-O 220-O is the output bolt. The systems described herein can provide spouts 210 and bolts 220 based on common usage, specifications and/or requirements of particular applications, etc., which applications can directly use as part of their topology.

The example embodiments described herein implement a runtime system that handles creation and connection of spouts 210 and bolts 220, starting the execution of spouts 210 and bolts 220 and controlling any termination, if required. According to example embodiments, the runtime system can also provide a global hash table to manage any common, in-memory, storage across spouts 210 and bolts 220, which can be required and/or requested by certain applications.

As an example of usage of the global hash table 190, when a particular bolt 220 needs to load data from an external system, the bolt 220 can maintain the loading status, e.g., in-progress, complete, incomplete, error, etc., in the global hash table 190, and other spouts 210/bolts 220 can view this status and implement appropriate logic, if necessary. Any such common information, which can be useful for other spouts 210/bolts 220, can be maintained in the global hash table 190. Also, the global hash table 190 can be implemented as the only way to deliver information to another upstream spout 210/bolt 220 (for example, a spout 210/bolt 220 which is in an earlier stage of the topology than the current spout 210/bolt 220). The global hash table 190 avoids the alternative of creating a cycle in the topology, which might lead to a deadlock condition.
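The loading-status example above can be sketched, purely for illustration, with a mutex-protected map shared by all component instances within one process. The class name `GlobalHashTable`, its `put`/`get` members, and the use of string keys and values are assumptions made for this sketch and do not describe the actual interface of the global hash table 190.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical in-process shared table; all accesses are serialized by a mutex
// so that concurrently running spout/bolt instances can read and write safely.
class GlobalHashTable {
public:
    // Inserts a new entry or overwrites an existing one.
    void put(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        entries_[key] = value;
    }

    // Copies the value for `key` into `out`; returns false if the key is absent.
    bool get(const std::string& key, std::string& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::unordered_map<std::string, std::string>::const_iterator it =
            entries_.find(key);
        if (it == entries_.end()) return false;
        out = it->second;
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::string> entries_;
};
```

A loading bolt could call `put("loader/status", "in-progress")` and later `put("loader/status", "complete")`, while an upstream spout polls `get("loader/status", status)`; no connection from the bolt back to the spout is needed, so the topology remains acyclic.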

Components within an application can run on a single machine or can be distributed across multiple machines. The communication between components is managed by the underlying runtime system. The example embodiments provide a stream programming model and an underlying runtime system, capable of efficiently handling streaming applications, including streaming video applications, on a distributed, multi-platform environment.

The example embodiments implement a streaming system that provides a specialized programming model for building streaming video/non-video applications and ease of use by providing simple application programming interfaces (APIs) to develop new applications and support an open-source programming model for video/non-video stream processing.

Two APIs that every spout and bolt needs to implement are setup( ) and execute( ). Logic within setup( ) is run only one time at the time of creation of the spout/bolt, while logic within execute( ) is run for every input tuple received from another spout/bolt. At the topology level, the following APIs are provided by the programming model:

Create a new spout: addSpout<Spout Class>(“spout-name”, “<one-or-more-spout-args>”, parallelism);

Create a new bolt: addBolt<Bolt class>(“bolt-name”, “<one-or-more-bolt-args>”, parallelism);

Add shuffle connection: addShuffleConnection(“<source-spout-or-bolt>”, “<destination-spout-or-bolt>”);

Add tag connection: addTagConnection(“<source-spout-or-bolt>”, “<destination-spout-or-bolt>”, “tag”);

Add broadcast connection: addBroadcastConnection(“<source-spout-or-bolt>”, “<destination-spout-or-bolt>”).

Any application topology 200 can be constructed using the above APIs and each spout 210 or bolt 220 within the application topology 200 implements the setup( ) and execute( ) functions.
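By way of non-limiting illustration, the following sketch shows how a small topology might be declared with APIs of the above form. Only the five API names come from the description above. The `Topology` class, its internal records, the placeholder component classes (`FrameSpout`, `DetectBolt`, `OutputBolt`) and the argument strings are hypothetical, and the sketch merely records the declarations instead of instantiating components.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ConnectionKind { Shuffle, Tag, Broadcast };

struct ComponentSpec { std::string name; std::string args; int parallelism; bool isSpout; };
struct ConnectionSpec { std::string source; std::string destination; ConnectionKind kind; std::string tag; };

// Hypothetical topology builder that records components and connections.
class Topology {
public:
    // The template parameter names the user-defined class; it is unused in this
    // sketch, whereas a real runtime would presumably keep a factory for it.
    template <typename SpoutClass>
    void addSpout(const std::string& name, const std::string& args, int parallelism) {
        addComponent(name, args, parallelism, true);
    }
    template <typename BoltClass>
    void addBolt(const std::string& name, const std::string& args, int parallelism) {
        addComponent(name, args, parallelism, false);
    }
    void addShuffleConnection(const std::string& src, const std::string& dst) {
        addConnection(src, dst, ConnectionKind::Shuffle, "");
    }
    void addTagConnection(const std::string& src, const std::string& dst, const std::string& tag) {
        addConnection(src, dst, ConnectionKind::Tag, tag);
    }
    void addBroadcastConnection(const std::string& src, const std::string& dst) {
        addConnection(src, dst, ConnectionKind::Broadcast, "");
    }
    const std::vector<ComponentSpec>& components() const { return components_; }
    const std::vector<ConnectionSpec>& connections() const { return connections_; }

private:
    void addComponent(const std::string& name, const std::string& args, int parallelism, bool isSpout) {
        assert(parallelism > 0);
        ComponentSpec spec{name, args, parallelism, isSpout};
        components_.push_back(spec);
    }
    void addConnection(const std::string& src, const std::string& dst, ConnectionKind kind, const std::string& tag) {
        ConnectionSpec spec{src, dst, kind, tag};
        connections_.push_back(spec);
    }
    std::vector<ComponentSpec> components_;
    std::vector<ConnectionSpec> connections_;
};

// Placeholder user classes; real ones would implement setup( ) and execute( ).
struct FrameSpout {};
struct DetectBolt {};
struct OutputBolt {};

// Declares one spout feeding two detector instances, which feed one output bolt.
Topology buildExampleTopology() {
    Topology topology;
    topology.addSpout<FrameSpout>("frames", "camera=0", 1);
    topology.addBolt<DetectBolt>("detect", "model=default", 2);
    topology.addBolt<OutputBolt>("output", "", 1);
    topology.addShuffleConnection("frames", "detect");
    topology.addTagConnection("detect", "output", "camera-id");
    return topology;
}
```

In this sketch the parallelism argument of 2 for "detect" corresponds to two parallel instances of that bolt, between which the shuffle connection would distribute tuples.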

The example embodiments achieve high performance (including speed) through optimized low-level communication and in-memory computing. This speed can enable real-time operation of video/non-video streaming analytics solutions. The example embodiments provide portability that allows users to write each application once and run the application on multiple platforms (e.g., edge processing device, server, or cloud; Windows™/Linux™/Android™), and multiple computing architectures (e.g., server/cloud processing only, or a combination of edge processing and server/cloud processing, etc.).

The example embodiments provide scalability. The example embodiments allow a user to scale the application to use available resources on the edge, server or cluster/cloud, etc., and support efficient partitioning of streaming workload between edge-devices and centralized servers/cloud.

According to example embodiments, the system can provide at-most-once processing semantics (a data item in an input stream is called a tuple, which is processed at most once).

The example embodiments implement monitoring. The example embodiments provide an ability to gather and present low-level system/performance metrics. The example embodiments provide an ability to run a streaming application for a finite duration. The example embodiments further provide an ability to define conditions for termination of a streaming application and cleanly terminate the application when the conditions are met.

Referring now to FIG. 3, a block diagram of a spout 210 is illustratively depicted in accordance with an embodiment of the present invention.

As shown in FIG. 3, spout 210 can include different types of spouts that control the input of data to the topology. For example, a time out spout (TimeoutSpout) 310 can be used to invoke and emit a user-defined data item/tuple at a periodic time interval. An asynchronous messaging tuple receiver 320 (for example, a ZeroMQ™ receiver, such as a ZMQTupleReceiverSpout) can be used to receive a data item/tuple over an asynchronous messaging (for example, ZeroMQ™) message queue. An asynchronous messaging video receiver 330 (for example, a ZeroMQ™ receiver, such as a ZMQVideoReceiverSpout) can be used to receive and decode data items/tuples containing video frames over an asynchronous messaging (for example, ZeroMQ™) message queue.

Referring now to FIG. 4, a block diagram of a bolt 220 is illustratively depicted in accordance with an embodiment of the present invention.

As shown in FIG. 4, bolt 220 can include different types of bolts 220 that process and transform the input data stream received from the previous spout 210/bolt 220 in a predetermined manner and pass the transformed data items/tuples to the successive bolt(s) 220 in the chain.

For example, asynchronous messaging tuple publisher 410 (for example, ZMQTuplePublisherBolt) can be used to publish any data item/tuple over an asynchronous message queue (for example, ZeroMQ™, Kafka™, RabbitMQ™, Apache ActiveMQ™, etc.). Filter bolt 420 can be used to filter certain data items/tuples and conditionally emit them to successive bolt(s) in the chain, based on the condition specified in the filter. For example, the condition can include processing the input tuple only if it “contains” or “does not contain” a specific key variable or value (for example, “X”). In another example, the condition can include processing the input tuple only if the value of the key “X” in the input tuple is “Y”. Typed bolt 430 can be used when specific input and output data types need to be used within the bolt 220. In contrast to instances in which the input tuple or the output tuple is free form, the input tuple and output tuple for Typed bolt 430 have a specific number and specific names of keys. Tuple windowing bolt 440 can be used when data items/tuples need to be aggregated over a certain window size and emitted to the successive bolt(s) 220 in the chain when the window size reaches the specified limit. An example of the window can include: waiting until a predetermined number (for example “x”) of frames are received and then processing and applying the logic on these frames aggregated within the window. This can be implemented when the video analytics application requires multiple frames to process rather than acting on a single frame.
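The count-based window described above can be sketched as follows. This is a minimal illustration that assumes a non-overlapping (tumbling) window and represents each frame by a string. The class name `TupleWindow` and its `add` member are hypothetical and do not describe the actual interface of the tuple windowing bolt 440.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical count-based window: buffers items until `windowSize` of them
// have arrived, then hands the whole batch to the caller and starts over.
class TupleWindow {
public:
    explicit TupleWindow(std::size_t windowSize) : windowSize_(windowSize) {
        assert(windowSize > 0);
    }

    // Buffers one item. Returns true and fills `out` with the complete window
    // when the window size is reached; otherwise returns false and leaves
    // `out` untouched.
    bool add(const std::string& item, std::vector<std::string>& out) {
        buffer_.push_back(item);
        if (buffer_.size() < windowSize_) return false;
        out.swap(buffer_);  // give the full window to the caller
        buffer_.clear();    // discard whatever `out` held before; start empty
        return true;
    }

private:
    std::size_t windowSize_;
    std::vector<std::string> buffer_;
};
```

With a window size of three, the first two calls to `add` only buffer their frames, and the third call returns all three frames in arrival order for joint processing.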

The example embodiments can provide (for example, commonly needed, specialized) components (spouts 210 and bolts 220), which can directly be used as part of the application topology. Although particular spouts 210 and bolts 220 are described by way of non-limiting example, it should be understood that additional spouts 210 and bolts 220 can be provided consistent with the embodiments herein. The spouts 210 and bolts 220 are provided for the convenience of the user. Any generic spouts 210 or bolts 220 can be implemented by the user. The user can provide the logic within the setup( ) and execute( ) functions for spouts 210 and/or bolts 220.

The example embodiments can run streaming applications built using the programming model. The example embodiments can efficiently partition streaming workload between edge-devices and centralized servers/cloud. Some spouts 210/bolts 220 can run on the edge while others can run on the centralized server/cloud. The example embodiments can build and run streaming video and non-video applications on a parallel, distributed system architecture. The example embodiments provide at-most-once processing semantics for data items/tuples.

The example embodiments can gather and present system/performance metrics for further analysis. The system/performance metrics are gathered on a per spout 210/bolt 220 basis. Examples include how much time the spout 210/bolt 220 has been up and running, how much CPU time it has consumed, how much memory it has consumed, the number of tuples received from another specific spout 210/bolt 220, the total number of tuples received from all spouts 210/bolts 220, the number of tuples sent to another specific bolt 220 in the chain, the total number of tuples sent to all bolts 220 in the chain, the total size (in bytes) of data received/sent, etc.
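For illustration only, the per-component tuple and byte counters listed above could be kept in a record such as the following. The structure `ComponentMetrics` and its members are hypothetical. Uptime, CPU time and memory consumption are omitted because they depend on platform-specific facilities.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical metrics record kept for one spout/bolt instance.
struct ComponentMetrics {
    std::map<std::string, std::uint64_t> tuplesReceivedFrom;  // keyed by source name
    std::map<std::string, std::uint64_t> tuplesSentTo;        // keyed by destination name
    std::uint64_t bytesReceived = 0;
    std::uint64_t bytesSent = 0;

    // Counts one tuple of `bytes` bytes received from component `source`.
    void recordReceived(const std::string& source, std::uint64_t bytes) {
        ++tuplesReceivedFrom[source];
        bytesReceived += bytes;
    }

    // Counts one tuple of `bytes` bytes sent to component `destination`.
    void recordSent(const std::string& destination, std::uint64_t bytes) {
        ++tuplesSentTo[destination];
        bytesSent += bytes;
    }

    // Total number of tuples received from all sources.
    std::uint64_t totalReceived() const { return sum(tuplesReceivedFrom); }

    // Total number of tuples sent to all destinations.
    std::uint64_t totalSent() const { return sum(tuplesSentTo); }

private:
    static std::uint64_t sum(const std::map<std::string, std::uint64_t>& counts) {
        std::uint64_t total = 0;
        for (std::map<std::string, std::uint64_t>::const_iterator it = counts.begin();
             it != counts.end(); ++it) {
            total += it->second;
        }
        return total;
    }
};
```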

The example embodiments implement a system and method to define and declare spouts 210 and bolts 220. The example embodiments can define and declare different types of connections between spouts 210 and bolts 220. The example embodiments can specify and execute an arrangement of spouts 210 and bolts 220. The example embodiments specify and execute the initial, one-time setup required by spouts 210 and bolts 220. The example embodiments specify and execute the specific task/function that the spout 210/bolt 220 would execute on the continuous input data item/tuple, which is received by the spout 210/bolt 220 as part of the streaming application. The example embodiments can specify, create and execute multiple instances of spouts 210 and bolts 220 for parallel execution of specific tasks to be performed by the spout 210/bolt 220. The example embodiments implement a system and method to check and run the streaming application only if at least one spout 210 is present in the topology. The example embodiments map the task for each specific spout 210/bolt 220 instance to low-level executors.

According to example embodiments, the system and method create multiple executors, assign tasks to executors and manage parallel execution of executors. The example embodiments optimize low-level communication between executors on the same machine or across different machines. The example embodiments increase speed of execution by using in-memory computing. The example embodiments can control communication of data items/tuples between executors by using different types of connections, for example, shuffle (250), tag (260) and broadcast (270). The example embodiments implement a system and method to remember/cache an address/destination where a data item/tuple with a specific tag is directed/required (for example, needs) to be sent. The example embodiments can filter and selectively emit/pass data items to successive bolt(s) 220 in the chain, based on a user-defined function. The example embodiments provide for all spouts 210 and bolts 220 to maintain a common, in-memory, shared storage area (global hash table), where each of them can read/modify/write various entries.

According to example embodiments, the system and method run the streaming application for a user-defined finite duration. The example embodiments can terminate a streaming application on reception of a generic signal used to terminate a program (for example, a SIGTERM signal). The example embodiments can cleanly complete, stop and remove any spout 210 from the streaming application. The example embodiments can cleanly stop and tear down the streaming application from within any spout 210/bolt 220. The example embodiments automatically stop and tear down the streaming application when none of the spouts 210/bolts 220 is running.
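One conventional way to react to a termination signal in C++ is sketched below, using the standard `std::signal` facility and a flag of type `std::sig_atomic_t`. The names `stopRequested`, `onTerminate`, `installTerminationHandler` and `runUntilStopped` are hypothetical, the iteration limit merely stands in for a user-defined finite duration, and the sketch is not asserted to be how the runtime system described herein handles termination.

```cpp
#include <csignal>

// Set by the signal handler; polled by the processing loop.
volatile std::sig_atomic_t stopRequested = 0;

// Signal handlers should do very little; this one only sets the flag.
extern "C" void onTerminate(int /*signalNumber*/) { stopRequested = 1; }

// Registers the handler for SIGTERM.
void installTerminationHandler() { std::signal(SIGTERM, onTerminate); }

// Runs at most `maxIterations` processing steps and stops early once a
// termination request has been observed. Returns the number of steps run.
int runUntilStopped(int maxIterations) {
    int executed = 0;
    while (executed < maxIterations && !stopRequested) {
        ++executed;  // placeholder for processing one data item/tuple
    }
    return executed;
}
```

Because the loop only checks the flag between steps, the step in progress completes before the loop exits, which is one simple way to obtain a clean stop.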

Referring now to FIG. 5, a flowchart of a process of application execution by a runtime system 550 is illustratively depicted in accordance with an embodiment of the present invention.

Runtime system 550 can be implemented using a general purpose programming language (for example, C++). Other streaming systems require a virtual machine (a JVM or others), which adds overhead in processing time. In contrast, the runtime system 550 can run on a single-tenant physical server (for example, bare-metal) without the use of any Virtual Machine (VM) and is hence more efficient, since it avoids the overhead that might be incurred by running in a VM.

Each application using the streaming programming model can be compiled into a shared library (for example, an application topology library). This shared library can be used by the runtime system 550 to execute the application. The runtime system 550 can be implemented using an application topology 170 from a shared library, such as application topology 200, described herein above with respect to FIG. 2. As shown in FIG. 5, an application execution procedure is implemented in the following manner.

The application execution starts (501) with the application topology library 170. Each application, which can be built using the streaming programming model described herein, is compiled into a shared library and exports special symbols (with the topology name), which the runtime system 550 uses during application execution.
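As a non-limiting sketch, an application library could export a symbol that incorporates the topology name by declaring a function with C linkage, so that the symbol name is not mangled by the C++ compiler. The structure `TopologyHandle`, the `topology_` prefix, the topology name `video_analytics` and the helper `exportedSymbolFor` are all assumptions made for this illustration. On POSIX platforms, a runtime could resolve such a symbol with `dlopen` and `dlsym`, but the description above does not specify the mechanism.

```cpp
#include <string>

// Hypothetical handle describing a topology contained in the library.
struct TopologyHandle {
    std::string name;
    int spoutCount;
    int boltCount;
};

// extern "C" keeps the symbol name unmangled, so that a loader can look it
// up using a plain string built from the topology name.
extern "C" TopologyHandle* topology_video_analytics() {
    static TopologyHandle handle{"video_analytics", 1, 2};
    return &handle;
}

// Builds the symbol name that a runtime would look up for a given topology
// name, under the naming convention assumed in this sketch.
std::string exportedSymbolFor(const std::string& topologyName) {
    return "topology_" + topologyName;
}
```

Given the topology name as input, the runtime in this sketch would look up the symbol `topology_video_analytics`, call it, and obtain the handle from which the spouts, bolts and connections can be retrieved.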

At step 1 (load library 502), the runtime system 550 can load thelibrary. Application library 170 is provided as input to the runtimesystem 550, along with the topology name and any application-relatedconfiguration. This application library 170 is initially loaded by theruntime system 550 and special symbol exported by the shared library(with topology name) can be used to obtain a handle to the applicationtopology.

At step 2 (create topology 504): the runtime system 550 createstopology. For example, once the handle to the topology is obtained, theactual topology including the spouts 210, bolts 220 and connections(250, 260, 270) between them, is retrieved and created.

At step 3 (spout present? 506): after creation of the topology, theruntime system 550 checks if at least one spout 210 is present in theapplication topology (spout present?). Spouts 210 are the startingpoint, where input data stream is ingested (for example, received,entered, etc.) in the application.

At step 4 (end 508): If there are no spouts 210 (spout present? 506-NO), no input data is coming into the topology, so the runtime system 550 exits (end).

At step 5 (create executors 510): If there is at least one spout 210 present in the topology (spout present? 506-YES), the runtime system 550 calculates and creates the total number of executors required. This is the total number of spout 210/bolt 220 instances to be created. Each spout 210/bolt 220 instance is handled by a separate executor.

At step 6 (assign tasks 512): After the executors are created, the task associated with each spout 210/bolt 220 instance is assigned to an executor, such that the required number of instances of spout 210/bolt 220 are created and their tasks are assigned to individual executors.
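Steps 3 through 6 can be illustrated with a minimal C++ sketch. The `Component` and `Executor` structs and the `hasSpout` and `createExecutors` functions are hypothetical names introduced only for illustration; the sketch assumes that the number of instances of each spout/bolt equals its declared parallelism.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one spout or bolt declared in a topology.
struct Component {
    std::string name;
    bool isSpout;
    int parallelism;  // number of instances requested for this component
};

// One executor per spout/bolt instance (steps 5 and 6).
struct Executor {
    std::string component;
    bool isSpout;
    int instanceId;  // 0 .. parallelism-1 within its component
};

// Step 3: a topology without spouts has no input data.
bool hasSpout(const std::vector<Component>& topology) {
    for (const Component& c : topology) {
        if (c.isSpout) return true;
    }
    return false;
}

// Steps 4 to 6: return no executors when there is no spout; otherwise
// create one executor per instance and assign that instance's task to it.
std::vector<Executor> createExecutors(const std::vector<Component>& topology) {
    std::vector<Executor> executors;
    if (!hasSpout(topology)) return executors;  // step 4: nothing to run
    for (const Component& c : topology) {
        assert(c.parallelism > 0);
        for (int i = 0; i < c.parallelism; ++i) {
            executors.push_back(Executor{c.name, c.isSpout, i});
        }
    }
    return executors;
}
```

Under these assumptions, a topology with one spout of parallelism 2 and one bolt of parallelism 3 yields five executors.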

At step 7 (setup connections 514): Once all executors are created and a task is assigned to each executor, the necessary connections are set up between the various executors (instances of spouts 210 and bolts 220) as per the connections specified in the topology 200, described herein above with respect to FIG. 2.

At step 8 (initiate setup of each executor 516): Each executor handles a particular instance of spout 210/bolt 220. These instances of spout 210/bolt 220 can have an initial, one-time setup that is done by the runtime system 550 before the actual execution starts. For example, logic within setup( ) can be run at this time (this process can be run only once).

At step 9 (start execution of bolts and then spouts 518): After the initial, one-time setup of each executor is done, the actual execution is started by the runtime system. First, all the bolt 220 instances are started, followed by all the spout 210 instances. By following this order of execution, the runtime system 550 ensures that when the spout(s) 210 emit data items/tuples to process, the bolt(s) 220 instances are ready and already waiting to process them. For example, logic within execute( ) can be run at this time for every input tuple (this process can keep repeating).
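The start order of step 9 can be sketched as a simple reordering of the executor list. This is only an illustration: the `Executor` struct and the `startOrder` function are hypothetical names, and an actual runtime would launch a thread or process for each entry rather than return a list.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical executor record: which instance it runs and whether it is a spout.
struct Executor {
    std::string name;
    bool isSpout;
};

// Returns the executors in start order: every bolt first, then every spout,
// so that consumers are already waiting when the first tuple is emitted.
// The relative order within each group is preserved.
std::vector<Executor> startOrder(std::vector<Executor> executors) {
    std::stable_partition(executors.begin(), executors.end(),
                          [](const Executor& e) { return !e.isSpout; });
    return executors;
}
```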

At step 10 (setup termination handler 520): Once the executors have started running, the runtime system 550 sets up a termination handler (for example, a SIGTERM handler) to terminate the topology when a termination signal (for example, a SIGTERM signal) is received. The termination signal tells a process when to terminate and, in some instances, can allow a program to implement associated processes (for example, tidy up) as appropriate before fully terminating. For example, the system 550 can save state information, delete temporary files, restore previous terminal modes, etc.

At step 11 (Spouts and bolts running? 522): After the termination handler is set up, the runtime system 550 keeps monitoring the status of executors (spouts 210 and bolts 220) and continues running the application until the executors (spouts 210 and bolts 220) stop running.

At step 12 (end 524): The runtime system 550 exits if the executors (spouts 210 and bolts 220) have stopped running (step 11, NO).

At step 13 (time for metrics measurement? 526): While the executors (spouts 210 and bolts 220) are running (step 11, YES), the runtime system 550 periodically checks if it is time to collect system/performance metrics.

At step 14 (measure system/performance metrics 528): Low-level system/performance metrics are collected by the runtime system 550 whenever the time interval between the previous collection and the current time has reached or exceeded the pre-defined time limit (step 13, YES).

At step 15 (received request from spout/bolt 530): If it is not yet time to collect system/performance metrics, then the runtime system 550 continues to check if it has received any request from any instance of the spout 210/bolt 220 (step 13, NO). If no request is received, the runtime system 550 goes back to step 11, where the runtime system 550 monitors the status of executors (spouts 210 and bolts 220).

At step 16 (process request 532): If any request is received from a spout 210/bolt 220, the runtime system 550 processes the request and goes back to step 11, where the runtime system 550 monitors the status of executors (spouts 210 and bolts 220).
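The decision sequence of steps 11 through 16 can be summarized as a single function that a monitoring loop would call repeatedly. The sketch below is an assumption-laden illustration: `RuntimeState`, `Action` and `nextAction` are hypothetical names, and the state fields stand in for whatever bookkeeping an actual runtime keeps.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical outcome of one pass through the monitoring loop.
enum class Action { Exit, CollectMetrics, ProcessRequest, KeepMonitoring };

// Hypothetical snapshot of what the runtime knows at the start of a pass.
struct RuntimeState {
    int runningExecutors;   // spout/bolt instances still running
    bool requestPending;    // a spout/bolt request is waiting
    std::chrono::steady_clock::time_point lastMetrics;  // previous collection
    std::chrono::milliseconds metricsInterval;          // pre-defined time limit
};

Action nextAction(const RuntimeState& s,
                  std::chrono::steady_clock::time_point now) {
    assert(s.metricsInterval.count() > 0);
    // Steps 11 and 12: exit once no executor is running.
    if (s.runningExecutors == 0) return Action::Exit;
    // Steps 13 and 14: collect metrics when the interval has elapsed.
    if (now - s.lastMetrics >= s.metricsInterval) return Action::CollectMetrics;
    // Steps 15 and 16: otherwise handle a pending request, if any.
    if (s.requestPending) return Action::ProcessRequest;
    // No request: go back to monitoring (step 11).
    return Action::KeepMonitoring;
}
```

Keeping the decision separate from the loop body makes the ordering of the checks (status, then metrics, then requests) easy to see and to exercise in isolation.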

The same runtime system 550 can be used to run the application topology 200 on a single node or distributed across multiple nodes, with the deployment strategy decided at the time of deployment. An appropriate low-level implementation can be decided by the runtime system 550 automatically at the time of deployment. In contrast to other streaming systems, streaming system 550 distinguishes and optimizes the low-level implementation of the application topology 200 based on whether the deployment is on a single node or across multiple nodes.

In case of single node deployment, the application topology 200 can be implemented in a single process comprising multiple low-level threads. Executors to manage spout 210 and bolt 220 tasks can be implemented as low-level threads, and a special thread (called a topology-manager) can be created to manage creation, execution, and termination of the application topology 200.

In case of multi-node deployment, the application topology 200 can be implemented as multiple processes. Executors to manage spout 210 and bolt 220 tasks can be implemented as processes, and a special process (called a topology-manager) can be created to manage creation, execution, and termination of the application topology 200.

Based on the underlying implementation, for example, whether they are implemented as threads of the same process or as separate processes, runtime system 550 can automatically choose the optimized topology-aware inter-process communication mechanism between spouts 210 and bolts 220. The streaming system 550 implements topology-aware optimized communication between components of the application topology 200.

By choosing to support basic minimal functions necessary for a typical streaming application, the example embodiments allow the runtime system 550 to be self-contained and lightweight (by using a single topology-manager to manage the application topology 200). In contrast, other streaming runtime systems require multiple components for managing various functions within the topology.

The runtime system 550 can be implemented with a programming model that allows users to write code once, and runtime system 550 can automatically run the same code anywhere, for example, edge, server, cloud or a combination of these, and on, for example, Windows™, Linux™ or Android™, etc.

Referring now to FIG. 6, a block diagram 600 of a procedure of processing a spout/bolt request is illustratively depicted in accordance with an embodiment of the present invention.

Runtime system 550 checks (tag destination? 602) if the request from spout 210/bolt 220 is to obtain the address of the destination where the data item/tuple belonging to a tag (260) connection needs to be sent. If so, then the runtime system 550 returns the address of the destination (return tag destination 604) for the tag (and ends processing request 606), if the address of the destination was already assigned. Otherwise, runtime system 550 identifies a new destination for the tag (if received for the first time) and sends the newly assigned destination address. Spout(s) 210/bolt(s) 220 request the address only when the tag is seen for the first time by the spout 210/bolt 220. Once the spout 210/bolt 220 receives the address for a particular tag, the spout 210/bolt 220 remembers/caches the address and uses the address whenever the same tag is repeated.
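The tag-destination lookup can be sketched as a small table that assigns a destination the first time a tag is seen and returns the same destination afterwards. The description does not state how a new destination is chosen, so the round-robin policy below is purely an assumption, as are the `TagRouter` class name and the use of an integer index in place of a destination address.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical runtime-side table mapping each tag to a destination instance.
class TagRouter {
public:
    explicit TagRouter(std::size_t destinationCount)
        : count_(destinationCount), next_(0) {
        assert(destinationCount > 0);
    }

    // Returns the destination index for a tag. A tag seen before keeps its
    // destination; a new tag is assigned the next one (assumed round-robin).
    std::size_t destinationFor(const std::string& tag) {
        auto it = assigned_.find(tag);
        if (it != assigned_.end()) return it->second;  // already assigned
        std::size_t dest = next_;
        next_ = (next_ + 1) % count_;
        assigned_.emplace(tag, dest);
        return dest;
    }

private:
    std::size_t count_;  // number of candidate destination instances
    std::size_t next_;   // next destination to hand out
    std::unordered_map<std::string, std::size_t> assigned_;
};
```

A spout or bolt would keep its own cache of the returned value, so the runtime is consulted only once per tag per sender.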

Runtime system 550 provides a way for spouts 210/bolts 220 to maintain a common, in-memory, shared storage area (called global hash table 190), accessible to all instances of spout 210/bolt 220. If the request received by the runtime system 550 is to Get/Retrieve an entry from the global hash table 190 (get global hash table entry? 608-YES), then the particular entry is searched in the global hash table 190 and returned (return global hash table entry 610), if found (and ends processing request 612).

Runtime system 550 checks if the request from spout 210/bolt 220 is to set an entry in the global hash table (set global hash table entry? 614). If so (YES), then the particular entry with its value is stored in the global hash table by the runtime system 550 (set global hash table entry 616 and end processing request 618), which can be accessed by any other spout 210/bolt 220 within the topology.

Runtime system 550 checks if the request from spout 210/bolt 220 is to erase an entry from the global hash table (erase global hash table entry? 620). If so, then the particular entry is removed from the global hash table (erase global hash table entry 622 and end processing request 624).
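The get, set and erase requests described above can be illustrated with a minimal in-memory table. This is a sketch under stated assumptions: the `GlobalHashTable` class name, the use of string values, and the mutex guarding concurrent access are all illustrative choices and are not taken from the description.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical shared key/value store reachable by every spout/bolt instance.
class GlobalHashTable {
public:
    // Set request: store or overwrite the entry.
    void set(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        table_[key] = value;
    }

    // Get request: copy the entry into 'out' and return true if it exists.
    bool get(const std::string& key, std::string& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = table_.find(key);
        if (it == table_.end()) return false;
        out = it->second;
        return true;
    }

    // Erase request: remove the entry; returns true if something was removed.
    bool erase(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        return table_.erase(key) > 0;
    }

private:
    mutable std::mutex mutex_;  // assumed: callers may be on different threads
    std::unordered_map<std::string, std::string> table_;
};
```

Because any instance can read what any other instance wrote, the table lets a downstream bolt pass information back to an upstream spout or bolt, irrespective of their order in the topology.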

Runtime system 550 checks if the request received from a spout reports the completion of the spout (and corresponding input task) (spout completed? 626). If so (YES), then the runtime system first checks if the spout 210 is actually running (verify if spout is running, stop and remove spout 628). If the spout 210 is not running (No spout running? 630-NO), then the runtime system 550 does not take any action and completes the request (at end processing request 624). If the spout 210 is in running state (630-YES), then the runtime system 550 stops and removes the spout 210 (628). After removal of the spout 210, the runtime system 550 checks if there are any other spouts 210 running. If so, then processing the request is completed. If there is no other spout running, then the runtime system stops and removes all running bolts 632 (since there is nothing to be done by bolts 220 if there is no running spout 210, which provides the input data for processing).
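The spout-completion handling can be sketched as follows. The `RunningSet` struct and the `onSpoutCompleted` function are hypothetical names, and identifying executors by name in a set is an assumption made only to keep the illustration short.

```cpp
#include <set>
#include <string>

// Hypothetical record of which spout and bolt instances are still running.
struct RunningSet {
    std::set<std::string> spouts;
    std::set<std::string> bolts;
};

// Handles a "spout completed" request. Returns true when the whole topology
// has stopped (no spout and no bolt is left running) after the request.
bool onSpoutCompleted(RunningSet& running, const std::string& spout) {
    // A spout that is not running requires no action.
    if (running.spouts.erase(spout) == 0) {
        return running.spouts.empty() && running.bolts.empty();
    }
    // The spout was running and has now been stopped and removed.
    if (running.spouts.empty()) {
        // No input can arrive any more, so stop and remove every bolt.
        running.bolts.clear();
    }
    return running.spouts.empty() && running.bolts.empty();
}
```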

Runtime system 550 checks if the request received from spout 210/bolt 220 is to tear down the topology (teardown topology 634). If so, then the runtime system 550 sends a “completed” signal to all the running spouts 636. These spouts 210, on receiving the “completed” signal, in turn send a request back to the runtime system 550 indicating that they have completed, and the runtime system 550 follows the procedure from step 626 to cleanly remove all running spouts 210 and bolts 220 and ultimately tear down the complete topology. The runtime system 550 ends processing requests 638.

Referring now to FIG. 7, a block diagram of a streaming platform device 700 implementing a runtime system for real-time streaming applications is illustratively depicted in accordance with an embodiment of the present invention.

As shown in FIG. 7, streaming platform device 700 can receive multiple streams (illustrated as 705-1 to 705-m) associated with different or the same streaming applications (for example, multiple video streams, audio streams, data streams, etc.). An application, once written, can run on multiple platforms and multiple computing architectures. The streaming platform device 700 can invoke runtime system 550 to set up application topologies (application topology 200-1 to 200-n) for each of the different streaming applications based on the different platforms and computing architectures. Some spouts 210 and bolts 220 can be commonly utilized by (or assigned to) the different application topologies 200. A global hash table 190 can be used in implementing the application topologies 200, where data can be shared across spouts 210 and bolts 220, irrespective of their order in the application topology 200.

The streaming device 700 receives and processes the streaming applications and outputs the stream to, for example, a rendering device 710 with a graphical user interface 720. Applications can scale to use available resources on the edge, server or cluster/cloud. Although a single streaming platform device 700 is illustrated, the system and methods described herein also support efficient partitioning of streaming workload between edge-devices and centralized servers/cloud.

FIG. 8 is a block diagram illustrating a streaming system architecture 800, in accordance with an embodiment of the present invention.

According to example embodiments, as shown in FIG. 8, streaming system architecture 800 includes an application topology 802, a streaming application programming model (application programming interfaces (APIs)) 815, a shared library 820, topology name (component) 825, topology configuration (component) 830, a streaming runtime system 840, a single node deployment? (decision) 845, deploy application topology within single node 850, single node 855, deploy application topology as multiple processes 860, and multiple nodes 865 (shown as nodes 865-1 to 865-N by way of example). Although a particular configuration of the streaming system architecture 800 is shown by way of non-limiting illustration, it should be understood that streaming system architecture 800 can include additional or fewer components, arranged in different manners with respect to each other based on the principles described herein.

Application topology 170 includes a logical entity consisting of spouts 210, bolts 220, connection and communication 810 (between spouts and bolts), topology manager 805 and global hash table 190.

Topology manager 805 can manage:
a. Creation and execution of spouts 210;
b. Creation and execution of bolts 220;
c. Connection between spouts 210 and bolts 220 (for example, the different types of connection as discussed herein above can be supported);
d. Communication and data transfer between spouts and bolts (depending on the type of connection);
e. Providing in-memory Global Hash Table for exchanging data between any spouts and bolts in the topology (irrespective of their order in the topology);
f. Termination of application topology.

Spouts 210 and bolts 220 can read and write into the global hash table 190, thereby communicating information to other spouts 210 and bolts 220.

The streaming programming model (application programming interfaces (APIs)) 815 exposes APIs, such as the APIs described below, to specify the application topology 170 and compile it into a shared library 820.

APIs exposed by the programming model 815:

Create a new spout 210:

addSpout<Spout Class>("spout-name", "<one-or-more-spout-args>", parallelism);

Create a new bolt 220:

addBolt<Bolt class>("bolt-name", "<one-or-more-bolt-args>", parallelism);

Add shuffle connection:

addShuffleConnection("<source-spout-or-bolt>", "<destination-spout-or-bolt>");

Add tag connection:

addTagConnection("<source-spout-or-bolt>", "<destination-spout-or-bolt>", "tag");

Add broadcast connection:

addBroadcastConnection("<source-spout-or-bolt>", "<destination-spout-or-bolt>");

Setup application topology (170) (should be implemented by topology manager (805)):

setup(int id, int parallelism, const std::vector<std::string> &args);

Initialization and Execution of spouts 210 and bolts 220 (should be implemented by every spout 210 and bolt 220):

setup(int id, int parallelism, const std::vector<std::string> &args); (logic for one-time initialization)

execute( ); (execution logic for every input tuple)

Transfer data item/tuple from spout or bolt to successive bolt(s):

emit(<data item/tuple>);

To write in global hash table 190:

set("key", value);

To read from global hash table 190:

get("key");

To export special symbol:

Topology("name") (825);

The last API call enables the user to export a special symbol in the shared library 820, which is used by runtime system 840 to extract the application topology 170.
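To show how the calls listed above might compose into a topology specification, the following sketch uses a stand-in `TopologySpec` class. Only the call shapes (`addSpout`, `addBolt`, and the three connection calls) follow the list above; the `TopologySpec` class itself, its internals, the user classes `CameraSpout`, `DetectBolt` and `AlertBolt`, and all names and argument strings are hypothetical and are not the programming model's actual implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-in that merely records what was declared (hypothetical).
class TopologySpec {
public:
    struct Component { std::string name, args; int parallelism; bool isSpout; };
    struct Connection { std::string source, destination, kind, tag; };

    explicit TopologySpec(std::string name) : name_(std::move(name)) {}

    template <typename SpoutClass>
    void addSpout(const std::string& name, const std::string& args, int parallelism) {
        assert(parallelism > 0);
        components_.push_back(Component{name, args, parallelism, true});
    }
    template <typename BoltClass>
    void addBolt(const std::string& name, const std::string& args, int parallelism) {
        assert(parallelism > 0);
        components_.push_back(Component{name, args, parallelism, false});
    }
    void addShuffleConnection(const std::string& src, const std::string& dst) {
        connections_.push_back(Connection{src, dst, "shuffle", ""});
    }
    void addTagConnection(const std::string& src, const std::string& dst,
                          const std::string& tag) {
        connections_.push_back(Connection{src, dst, "tag", tag});
    }
    void addBroadcastConnection(const std::string& src, const std::string& dst) {
        connections_.push_back(Connection{src, dst, "broadcast", ""});
    }

    const std::string& name() const { return name_; }
    const std::vector<Component>& components() const { return components_; }
    const std::vector<Connection>& connections() const { return connections_; }

private:
    std::string name_;
    std::vector<Component> components_;
    std::vector<Connection> connections_;
};

// Hypothetical user classes; real ones would provide setup( ) and execute( ).
struct CameraSpout {};
struct DetectBolt {};
struct AlertBolt {};

// Declares a three-stage topology: camera -> detect -> alert.
TopologySpec buildExample() {
    TopologySpec t("video-alerts");
    t.addSpout<CameraSpout>("camera", "source=camera0", 2);
    t.addBolt<DetectBolt>("detect", "model=default", 4);
    t.addBolt<AlertBolt>("alert", "", 1);
    t.addShuffleConnection("camera", "detect");
    t.addTagConnection("detect", "alert", "camera-id");
    return t;
}
```

In this sketch the topology name passed to the constructor plays the role of the name given to `Topology("name")`; how the special symbol is actually exported from the library is not modeled here.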

Once the user writes the application topology 170 using the above APIs exposed by the streaming programming model, the application topology 170 is then compiled into a shared library 820 with a special symbol exported with the topology name 825. This shared library 820 is provided to the runtime system 840 at the time of execution of the application topology 170. The shared library 820 is created for convenience, but alternative implementations can include a static library linked with the runtime system 840 and executed within a single binary.

Topology name 825 is provided to the streaming runtime system 840, which uses it to obtain a handle and extract the application topology 170 from the shared library 820. Any configuration parameters that need to be passed to the application topology 170, including the node(s) on which the application topology 170 is to be run, are provided to the streaming runtime system 840.

Streaming runtime system 840 obtains the shared library 820, topology name 825 and topology configuration 830, including node(s) information, as part of the request to run the application topology.

The streaming runtime system 840 first checks whether the deployment request is for a single node or across multiple nodes (single node deployment? 845), by checking the provided node(s) information. The source code is the same; however, the implementation can be different at runtime, thus providing flexibility of deployment.

If it is single node deployment (845-YES), the streaming runtime system 840 obtains a handle and extracts the application topology 170 from the shared library 820 using the provided topology name 825. The logical entity, e.g., application topology 170, at this point, is deployed within a single process on a single node 855 on bare-metal, e.g., without any VM. Separate threads are created for the spout(s) 210, bolt(s) 220 and topology-manager 805, and topology-configuration 830 is passed to the application topology 170. Spout(s) 210, bolt(s) 220 and topology-manager 805 are implemented as threads of the process on node 855.

If the deployment is for multiple nodes (845-NO), the runtime system 840 obtains a handle and extracts the application topology 170 from the shared library 820 using the topology name 825. The logical entity, e.g., application topology 170, at this point, is deployed as multiple processes across multiple nodes 865 (shown as 865-1 to 865-N) on bare-metal, e.g., without any VM. Separate processes are created for spout(s) 210, bolt(s) 220 and topology manager 805, and topology-configuration 830 is passed to the application topology. Spout(s) 210, bolt(s) 220 and topology-manager 805 are implemented as processes distributed across multiple nodes 865-1 to 865-N.

According to example embodiments, a hybrid deployment of the application topology can be implemented such that within a single node, there can be a single process with multiple threads, while across nodes, there can be multiple processes (each with multiple threads), thereby forming a hybrid application topology.
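The deployment decision (845) and a hybrid placement can be sketched as follows. The names `Deployment`, `chooseDeployment` and `placeExecutors` are hypothetical. Treating an empty node list as a local single-node deployment and spreading executors round-robin across nodes are assumptions made for illustration; the description does not specify a placement policy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// How executors are realized at runtime (hypothetical enumeration).
enum class Deployment { SingleProcessThreads, MultiProcess };

// Decision 845: one node means threads in one process; several nodes mean
// multiple processes. An empty list is assumed to mean the local node.
Deployment chooseDeployment(const std::vector<std::string>& nodes) {
    return nodes.size() <= 1 ? Deployment::SingleProcessThreads
                             : Deployment::MultiProcess;
}

// Hybrid placement sketch: one process per node, with the executors placed
// on a node running as threads of that process. Executors are identified by
// index and spread round-robin across the nodes (assumed policy).
std::vector<std::vector<int>> placeExecutors(std::size_t nodeCount,
                                             int executorCount) {
    assert(nodeCount > 0);
    std::vector<std::vector<int>> perNode(nodeCount);
    for (int e = 0; e < executorCount; ++e) {
        perNode[static_cast<std::size_t>(e) % nodeCount].push_back(e);
    }
    return perNode;
}
```

With two nodes and five executors, this sketch places executors 0, 2 and 4 on the first node and executors 1 and 3 on the second.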

The foregoing is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that those skilled in the art may implement various modifications without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

What is claimed is:
 1. A method for specifying and compiling real-time streaming applications, comprising: specifying an application topology for an application including at least one spout, at least one bolt, at least one connection, a global hash table, and a topology manager, wherein each at least one spout receives input data and each at least one bolt transforms the input data, the global hash table allows in memory communication between each of the at least one spout and the at least one bolt to others of the at least one spout and the at least one bolt, and the topology manager manages the application topology; compiling the application into a shared or static library for applications, wherein the runtime system can be used to run the application topology on a single node or distribute across multiple nodes; and exporting a special symbol associated with the application, wherein the application and the application topology are configured to be retrieved from the shared or static library for execution based on the special symbol.
 2. The method as recited in claim 1, further comprising, executing the application topology by: determining at runtime, based on node information in a runtime-request, whether to deploy and execute the application topology within a single process consisting of multiple threads on the single node or deploy and execute the application topology using multiple processes distributed across the multiple nodes; retrieving the application topology from the shared or static library for execution based on the special symbol; using low-level, topology-aware, inter-process communication mechanism between spouts and bolts; and processing at least one request based on the at least one spout, the at least one bolt and the at least one connection.
 3. The method as recited in claim 2, further comprising: after creation of the application topology, checking if any spout is present in the application topology, if there are no spouts present in the application topology, determining that there is no input data and exiting the application topology; and if there are spouts present in the application topology, calculating and creating a total number of executors, wherein the total number of executors corresponds to a total number of spout/bolt instances to be created.
 4. The method as recited in claim 3, further comprising: after the executors are created, assigning each task associated with each spout/bolt instance to an executor; starting the bolt instances; and following by starting all the spout instances.
 5. The method as recited in claim 2, wherein determining and executing the application topology further comprises: determining the application topology based on a programming model; and enabling execution of a same code on multiple platforms using the application topology.
 6. The method as recited in claim 2, further comprising: collecting metrics whenever a time interval between a previous collection and a current time has reached or exceeded a pre-defined time limit; and monitoring a status of executors if no request is received.
 7. The method as recited in claim 2, further comprising: checking if a request is to obtain an address of a destination where a data item belonging to a tag is to be sent; if the request is to obtain the address, returning the address of the destination for the tag if the address was previously assigned; and if the address was not previously assigned, assigning a new destination for the tag and sending a newly assigned destination address.
 8. The method as recited in claim 2, further comprising: implementing the application topology as a hybrid topology that includes at least one single process with multiple threads on a single node and at least a plurality of processes that each have multiple threads across a plurality of nodes.
 9. The method as recited in claim 1, wherein the at least one spout includes at least one of: a time out spout that invokes and emits a user-defined data item at a periodic time interval; an asynchronous messaging tuple receiver spout that receives a data item over an asynchronous messaging queue; an asynchronous messaging video receiver spout that receives and decodes data items containing video frames over an asynchronous messaging queue; and a user-defined spout providing logic within setup( ) and execute( ) functions.
 10. The method as recited in claim 1, wherein the at least one bolt includes at least one of: an asynchronous messaging tuple publisher bolt that publishes at least one data item over an asynchronous messaging queue; a filter bolt that filters particular data items and conditionally emits the particular data items to successive bolts in a chain, based on at least one condition specified in a filter; a typed bolt that can be used for specific input data type and output data type; a tuple windowing bolt that can be used when data items are to be aggregated over a particular window size and emitted to the successive bolts in the chain; and a user-defined bolt providing logic within setup( ) and execute( ) functions.
 11. The method as recited in claim 1, wherein the at least one connection includes at least one of: a shuffle connection that takes a tuple from a producer and sends the tuple to a randomly chosen consumer; a tag connection that allows a user to control how tuples are sent to bolts based on at least one tag in the tuple; and a broadcast connection that sends the tuple to all instances of all receiving bolts.
 12. A computer system for specifying and compiling real-time streaming applications, comprising: a processor device operatively coupled to a memory device, the processor device being configured to: specify an application topology for an application including at least one spout, at least one bolt, at least one connection, a global hash table, and a topology manager, wherein each at least one spout receives input data and each at least one bolt transforms the input data, the global hash table allows in memory communication between each of the at least one spout and the at least one bolt to others of the at least one spout and the at least one bolt, and the topology manager manages the application topology; compile the application into a shared or static library for applications, wherein the runtime system can be used to run the application topology on a single node or distribute across multiple nodes; and export a special symbol associated with the application, wherein the application and the application topology are configured to be retrieved from the shared or static library for execution based on the special symbol.
 13. The system as recited in claim 12, wherein the processor device, during execution of the application topology, is further configured to: determine at runtime, based on node information in a runtime-request, whether to deploy and execute the application topology within a single process consisting of multiple threads on the single node or deploy and execute the application topology using multiple processes distributed across the multiple nodes; retrieve the application topology from the shared or static library for execution based on the special symbol; use low-level, topology-aware, inter-process communication mechanism between spouts and bolts; and process at least one request based on the at least one spout, the at least one bolt and the at least one connection.
 14. The system as recited in claim 13, wherein the processor device is further configured to: after creation of the application topology, check if any spout is present in the application topology; if there are no spouts present in the application topology, determine that there is no input data and exiting the application topology; and if there are any spouts present in the application topology, calculate and create a total number of executors, wherein the total number of executors corresponds to a total number of spout/bolt instances to be created.
 15. The system as recited in claim 13, wherein the processor device is further configured to: after the executors are created, assign each task associated with each spout/bolt instance to an executor; start the bolt instances; and follow by starting all the spout instances.
 16. The system as recited in claim 13, wherein, when determining and executing the application topology, the processor device is further configured to: determine the application topology based on a programming model; and enable execution of a same code on multiple platforms using the application topology.
 17. The system as recited in claim 12, wherein the at least one spout includes at least one of: a time out spout that invokes and emits a user-defined data item at a periodic time interval; an asynchronous messaging tuple receiver spout that receives a data item over an asynchronous messaging queue; an asynchronous messaging video receiver spout that receives and decodes data items containing video frames over an asynchronous messaging queue; and a user-defined spout providing logic within setup( ) and execute( ) functions.
 18. The system as recited in claim 12, wherein the at least one bolt includes at least one of: an asynchronous messaging tuple publisher bolt that publishes at least one data item over an asynchronous messaging queue; a filter bolt that filters particular data items and conditionally emits the particular data items to successive bolts in a chain, based on at least one condition specified in a filter; a typed bolt that can be used for specific input data type and output data type; a tuple windowing bolt that can be used when data items are to be aggregated over a particular window size and emitted to the successive bolts in the chain; and a user-defined bolt providing logic within setup( ) and execute( ) functions.
 19. The system as recited in claim 12, wherein the at least one connection includes at least one of: a shuffle connection that takes a tuple from a producer and sends the tuple to a randomly chosen consumer; a tag connection that allows a user to control how tuples are sent to bolts based on at least one tag in the tuple; and a broadcast connection that sends the tuple to all instances of all receiving bolts.
 20. A computer program product for specifying and compiling real-time streaming applications, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to perform the method comprising: specifying an application topology for an application including at least one spout, at least one bolt, at least one connection, a global hash table, and a topology manager, wherein each at least one spout receives input data and each at least one bolt transforms the input data, the global hash table allows in memory communication between each of the at least one spout and the at least one bolt to others of the at least one spout and the at least one bolt, and the topology manager manages the application topology; compiling the application into a shared or static library for applications, wherein the runtime system can be used to run the application topology on a single node or distribute across multiple nodes; and exporting a special symbol associated with the application, wherein the application and the application topology are configured to be retrieved from the shared or static library for execution based on the special symbol.